[1] 2
Dec 13, 2022
Some of you might have access to the Schmitzlab BCCN server and can use RStudio in the browser on the server.
Just ask for the link.
You can start right away!
Output and return it
But the worst enemy you can meet will always be yourself.
– Friedrich Nietzsche
SD_cells, sdCells, SdCells
parameter as par
i, n
|> or %>% which are a bit like a water slide
You jump (|>) in there → ()
integer (1L, 1:10)
double (1.1, pi)
complex (1+2i)
logical (TRUE, FALSE)
character ("hello", LETTERS)
factor (factor(x = LETTERS))
vector
matrix/array
list
data.frame/tibble/data.table
 [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s"
[20] "t" "u" "v" "w" "x" "y" "z"
[1] a b c d e f g h i j k l m n o p q r s t u v w x y z
Levels: a b c d e f g h i j k l m n o p q r s t u v w x y z
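The basic types listed above can be inspected directly. This is a minimal sketch using only base R; the variable name `f` is chosen here for illustration.

```r
# typeof() reports the storage type of a value
typeof(1L)        # "integer"
typeof(1.1)       # "double"
typeof(1 + 2i)    # "complex"
typeof(TRUE)      # "logical"
typeof("hello")   # "character"

# A factor stores integer codes plus a set of levels
f <- factor(x = letters)
class(f)          # "factor"
typeof(f)         # "integer"
nlevels(f)        # 26
```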
Return the First or Last Parts of an Object
Returns the first or last parts of a vector, matrix, table, data frame or function. Since head() and tail() are generic functions, they may also have been extended to other classes.
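A short sketch of head() and tail() on a vector and on a data frame. `mtcars` is a data set that ships with R and is used here only as a stand-in.

```r
head(LETTERS, n = 3)   # "A" "B" "C"
tail(LETTERS, n = 3)   # "X" "Y" "Z"

# The same functions work on data frames (here: the first two rows)
first_rows <- head(mtcars, n = 2)
nrow(first_rows)       # 2
```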
A matrix is a vector with 2 dimensions
[,1] [,2]
[1,] 1 3
[2,] 2 4
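One way a matrix like the one above could be created (the exact call from the slides is not shown, so this is an assumption):

```r
# matrix() fills column by column by default
m <- matrix(data = 1:4, nrow = 2, ncol = 2)
dim(m)       # 2 2
m[1, 2]      # row 1, column 2 -> 3

# Giving a plain vector two dimensions also turns it into a matrix
v <- 1:4
dim(v) <- c(2, 2)
is.matrix(v) # TRUE
```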
A list can store different types of data with variable length
$Alphabet
[1] "A" "B" "C" "D" "E" "F" "G" "H" "I" "J" "K" "L" "M" "N" "O" "P" "Q" "R" "S"
[20] "T" "U" "V" "W" "X" "Y" "Z"
$Numbers
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
[19] 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36
[37] 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54
[55] 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72
[73] 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90
[91] 91 92 93 94 95 96 97 98 99 100
Elements in a list can be accessed with $
[1] "A" "B" "C" "D" "E" "F" "G" "H" "I" "J" "K" "L" "M" "N" "O" "P" "Q" "R" "S"
[20] "T" "U" "V" "W" "X" "Y" "Z"
Data frames are lists where all elements have the same length
Access with $
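A minimal sketch of a list and a data frame, both accessed with $. The names `my_list` and `df` are made up for this example.

```r
# A list can hold elements of different type and length
my_list <- list(Alphabet = LETTERS, Numbers = 1:100)
my_list$Alphabet           # the 26 capital letters
length(my_list$Numbers)    # 100

# A data frame is a list whose elements (columns) all have the same length
df <- data.frame(Slice = 1:3, Group = c("WT", "WT", "KO"))
df$Group                   # "WT" "WT" "KO"
is.list(df)                # TRUE
```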
Functions take an input and can return an output
[1] 41.66084 45.13191 41.75570 38.12088 42.38217 43.05238
Summary function:
If you want/have to then you can write your own functions
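As a sketch of what a self-written function can look like, here is a hypothetical helper `sem()` (not from the slides) that computes the standard error of the mean.

```r
# Hypothetical helper: standard error of the mean
sem <- function(x, na.rm = TRUE) {
  if (na.rm) {
    x <- x[!is.na(x)]        # drop missing values first
  }
  sd(x) / sqrt(length(x))    # the last expression is returned
}

sem(c(1, 2, 3, 4, NA))
```

The function takes an input (`x`), computes the output and returns it; the `na.rm` argument is a design choice of this sketch, mirroring base functions such as mean().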
Make a sequence:
[1] 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
[1] 0.00 0.25 0.50 0.75 1.00
Make repeats:
[1] "A" "A" "A" "B" "B" "B"
[1] "A" "B" "A" "B" "A" "B"
[1] "A" "A" "A" "A" "A"
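Calls of the following kind would produce sequences and repeats of the shape shown above. The original calls are not part of the text, so the exact arguments are an assumption.

```r
# Sequences: fixed step size or fixed number of values
seq(from = 0, to = 1, by = 0.1)        # 11 values from 0 to 1
seq(from = 0, to = 1, length.out = 5)  # 0.00 0.25 0.50 0.75 1.00

# Repeats: each element in turn, the whole vector, or a single value
rep(c("A", "B"), each = 3)    # "A" "A" "A" "B" "B" "B"
rep(c("A", "B"), times = 3)   # "A" "B" "A" "B" "A" "B"
rep("A", times = 5)           # "A" "A" "A" "A" "A"
```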
Simulate!
Simulations can be very useful to check an analysis or to get a feeling for the data (or you may already have a simulation in mind)
[1] -0.3342374 -0.6652091 -0.1018155 -0.5427938
[1] 0 0 0 0 0 0 1 1 0 0
For more distributions check ?Distributions
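A small simulation sketch, assuming normally and binomially distributed data. The parameter values (mean 42, sd 2, probability 0.2) are made up for illustration.

```r
set.seed(1)  # makes the random draws repeatable

x <- rnorm(n = 4, mean = 0, sd = 1)              # 4 draws from a standard normal
events <- rbinom(n = 10, size = 1, prob = 0.2)   # 10 yes/no events (0 or 1)

# Check an analysis on data where the truth is known:
# the estimates should land close to the values we put in
sim <- rnorm(n = 1000, mean = 42, sd = 2)
mean(sim)
sd(sim)
```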
Can we apply a function to all rows or columns of a matrix?
Use apply!
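A minimal sketch on a small made-up matrix: MARGIN = 1 applies the function to each row, MARGIN = 2 to each column. lapply and sapply do the same for the elements of a vector or list.

```r
m <- matrix(1:6, nrow = 2)   # 2 rows, 3 columns

apply(X = m, MARGIN = 1, FUN = sum)   # row sums:    9 12
apply(X = m, MARGIN = 2, FUN = sum)   # column sums: 3 7 11

squares_list <- lapply(1:3, function(i) i^2)   # always returns a list
squares_vec  <- sapply(1:3, function(i) i^2)   # simplified to a vector: 1 4 9
```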
MARGIN: use function on rows (1) or columns (2)
apply will try to “simplify” output by default (as 1D vector or 2D matrix); if that is not possible it returns a list
Other apply functions:
lapply: uses list or vector → outputs list
sapply: uses vectors → outputs vector, matrix or list
Find files in folders with list.files
[1] "DataFile1.csv" "DataFile2.csv" "DataFile3.csv" "DataFile4.csv"
[5] "DataFile5.csv" "File1.csv" "File2.csv" "File3.csv"
[9] "File4.csv" "File5.csv" "SubDirectory"
Output the whole path (including the folder)
[1] "Data/FileDirectory/DataFile1.csv" "Data/FileDirectory/DataFile2.csv"
[3] "Data/FileDirectory/DataFile3.csv" "Data/FileDirectory/DataFile4.csv"
[5] "Data/FileDirectory/DataFile5.csv" "Data/FileDirectory/File1.csv"
[7] "Data/FileDirectory/File2.csv" "Data/FileDirectory/File3.csv"
[9] "Data/FileDirectory/File4.csv" "Data/FileDirectory/File5.csv"
[11] "Data/FileDirectory/SubDirectory"
Output only specific files with "Data" in the name
[1] "Data/FileDirectory/DataFile1.csv" "Data/FileDirectory/DataFile2.csv"
[3] "Data/FileDirectory/DataFile3.csv" "Data/FileDirectory/DataFile4.csv"
[5] "Data/FileDirectory/DataFile5.csv"
Also output files in folders inside the folder
[1] "Data/FileDirectory/DataFile1.csv"
[2] "Data/FileDirectory/DataFile2.csv"
[3] "Data/FileDirectory/DataFile3.csv"
[4] "Data/FileDirectory/DataFile4.csv"
[5] "Data/FileDirectory/DataFile5.csv"
[6] "Data/FileDirectory/File1.csv"
[7] "Data/FileDirectory/File2.csv"
[8] "Data/FileDirectory/File3.csv"
[9] "Data/FileDirectory/File4.csv"
[10] "Data/FileDirectory/File5.csv"
[11] "Data/FileDirectory/SubDirectory/SubDirFile1.csv"
[12] "Data/FileDirectory/SubDirectory/SubDirFile2.csv"
[13] "Data/FileDirectory/SubDirectory/SubDirFile3.csv"
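The three variants above can be sketched as follows. To keep the example runnable anywhere, it first builds a small throw-away folder in the temporary directory; the folder and file names imitate the ones above but are otherwise an assumption.

```r
# Build a small example folder with one sub-folder
dir_path <- file.path(tempdir(), "FileDirectory")
dir.create(file.path(dir_path, "SubDirectory"), recursive = TRUE, showWarnings = FALSE)
file.create(file.path(dir_path, c("DataFile1.csv", "File1.csv")))
file.create(file.path(dir_path, "SubDirectory", "SubDirFile1.csv"))

list.files(path = dir_path)                      # file and folder names only
list.files(path = dir_path, full.names = TRUE)   # names including the path

# Only files with "Data" in the name
data_files <- list.files(path = dir_path, pattern = "Data", full.names = TRUE)

# Also look into folders inside the folder
all_files <- list.files(path = dir_path, recursive = TRUE, full.names = TRUE)
```

With `recursive = TRUE` the sub-folder itself is no longer listed, only the files inside it, which matches the last output above.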
Measurement depends on (~) Group (Measurement ~ Group)
Measurement depends on Group and Time (Measurement ~ Group + Time)
Measurement depends on Group and Time and their interaction (Measurement ~ Group*Time)
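A sketch of how such formulas are used in practice, here with lm() on simulated data. The data frame `d` and its values are made up; only the column names follow the formulas above.

```r
set.seed(1)
d <- data.frame(Group = rep(c("WT", "KO"), each = 10),
                Time  = rep(1:5, times = 4))
d$Measurement <- rnorm(n = nrow(d))   # random values, no real effect

fit_additive    <- lm(Measurement ~ Group + Time, data = d)
fit_interaction <- lm(Measurement ~ Group * Time, data = d)

# The * notation expands to both main effects plus the interaction
attr(terms(Measurement ~ Group * Time), "term.labels")
# "Group" "Time" "Group:Time"
```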
Measurement ~ Group + Time + Group:Time (the same model as Measurement ~ Group*Time, written out)
rticle,bookdown)
Every time you write R code you can reuse it
Just update the data set and you will get the whole analysis within seconds, reproducibly
Make graphs which you don’t have to modify in Illustrator etc.
A project should start with a New Project
Keep track of changes with version control (Git) in RStudio
Packages I should install? It depends…
ggplot2 and data.table
Get the data sets and the code for the examples today:
The string (in quotation marks "...") is the file path
# A tibble: 31 × 5
Slice Group Pulse1 Pulse2 EventsAfterEPSP
<dbl> <chr> <dbl> <dbl> <dbl>
1 1 WT 1.42 2.7 1
2 2 WT 0.78 0.87 2
3 3 WT 0.96 1.64 2
4 4 WT 0.64 1.35 2
5 5 WT 0.92 1.88 2
6 6 WT 0.76 1.21 1
7 7 WT 1.67 3.09 0
8 8 WT 0.93 0.89 1
9 9 WT 0.64 1.58 3
10 10 WT 0.57 1.25 2
# … with 21 more rows
Slice Group Pulse1 Pulse2
Min. : 1.0 Length:31 Min. :0.480 Min. :0.470
1st Qu.: 8.5 Class :character 1st Qu.:0.755 1st Qu.:1.160
Median :16.0 Mode :character Median :0.960 Median :1.450
Mean :16.0 Mean :1.014 Mean :1.692
3rd Qu.:23.5 3rd Qu.:1.255 3rd Qu.:2.155
Max. :31.0 Max. :1.670 Max. :3.220
EventsAfterEPSP
Min. :0.000
1st Qu.:0.500
Median :2.000
Mean :1.516
3rd Qu.:2.000
Max. :4.000
Slice Group Pulse1 Pulse2 EventsAfterEPSP
1: 1 WT 1.42 2.70 1
2: 2 WT 0.78 0.87 2
3: 3 WT 0.96 1.64 2
4: 4 WT 0.64 1.35 2
5: 5 WT 0.92 1.88 2
6: 6 WT 0.76 1.21 1
7: 7 WT 1.67 3.09 0
8: 8 WT 0.93 0.89 1
9: 9 WT 0.64 1.58 3
10: 10 WT 0.57 1.25 2
11: 11 WT 1.40 2.81 2
12: 12 WT 1.28 2.67 0
13: 13 WT 1.24 2.19 4
14: 14 WT 1.10 2.12 1
15: 15 WT 1.27 3.22 2
16: 16 KO 0.72 1.14 0
17: 17 KO 0.86 0.47 1
18: 18 KO 1.64 2.78 4
19: 19 KO 1.56 2.12 4
20: 20 KO 1.06 1.52 2
21: 21 KO 1.05 1.35 0
22: 22 KO 0.83 1.40 0
23: 23 KO 0.48 0.65 2
24: 24 KO 0.68 1.00 2
25: 25 KO 1.10 1.45 0
26: 26 KO 1.18 2.28 0
27: 27 KO 1.30 2.05 2
28: 28 KO 0.77 1.14 2
29: 29 KO 0.75 1.39 1
30: 30 KO 0.72 1.06 2
31: 31 KO 1.14 1.18 0
Slice Group Pulse1 Pulse2 EventsAfterEPSP
data.table again because we still have it active from before
DT[i,j,by]
:=
V1) by treatment variable
.() is the same as list()
    Slice Group Pulse1 Pulse2 EventsAfterEPSP      PPR
1: 1 WT 1.42 2.70 1 1.901408
2: 2 WT 0.78 0.87 2 1.115385
---
30: 30 KO 0.72 1.06 2 1.472222
31: 31 KO 1.14 1.18 0 1.035088
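The DT[i, j, by] pattern and := can be sketched on four rows taken from the table above. This assumes the data.table package is installed; the column name `MeanPulse1` is made up for the example.

```r
library(data.table)

# Rows 1, 2, 30 and 31 of the data set shown above
DataPPR <- data.table(Slice  = c(1, 2, 30, 31),
                      Group  = c("WT", "WT", "KO", "KO"),
                      Pulse1 = c(1.42, 0.78, 0.72, 1.14),
                      Pulse2 = c(2.70, 0.87, 1.06, 1.18))

DataPPR[Group == "WT"]                                # i: select rows
DataPPR[, .(MeanPulse1 = mean(Pulse1))]               # j: compute on columns
DataPPR[, .(MeanPulse1 = mean(Pulse1)), by = Group]   # by: once per group

# := adds the paired-pulse ratio as a new column
DataPPR[, PPR := Pulse2 / Pulse1]
```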
:= means write/add column

# geom_beeswarm() comes from the ggbeeswarm package
ggplot(data = DataPPR, aes(x = Group, y = PPR, colour = Group))+
  geom_beeswarm(size=4, alpha=0.5, cex = 6, priority = "ascending")+
  scale_y_continuous(name = "PPR")+
  scale_x_discrete(name = "")+
  scale_colour_manual(values = c("WT" = "black", "KO" = "red"), name = "")+
  theme_classic()+
  theme(legend.position = "none")

# Same plot with the x axis labels on top and without x axis line and ticks
ggplot(data = DataPPR, aes(x = Group, y = PPR, colour = Group))+
  geom_beeswarm(size=4, alpha=0.5, cex = 6)+
  scale_y_continuous(name = "PPR")+
  scale_x_discrete(name = "", position = "top")+
  scale_colour_manual(values = c("WT" = "black", "KO" = "red"), name = "")+
  theme_classic()+
  theme(legend.position = "none", axis.line.x = element_blank(),
        axis.ticks.x = element_blank())

Wide:
| Slice | Group | Pulse1 | Pulse2 |
|---|---|---|---|
| 1 | WT | 1.42 | 2.70 |
| 2 | WT | 0.78 | 0.87 |
| 3 | WT | 0.96 | 1.64 |
| 4 | WT | 0.64 | 1.35 |
| 5 | WT | 0.92 | 1.88 |
Long:
| Slice | Group | Pulse | Amplitude |
|---|---|---|---|
| 1 | WT | Pulse1 | 1.42 |
| 2 | WT | Pulse1 | 0.78 |
| 3 | WT | Pulse1 | 0.96 |
| 4 | WT | Pulse1 | 0.64 |
| 5 | WT | Pulse1 | 0.92 |
| 1 | WT | Pulse2 | 2.70 |
| 2 | WT | Pulse2 | 0.87 |
| 3 | WT | Pulse2 | 1.64 |
| 4 | WT | Pulse2 | 1.35 |
| 5 | WT | Pulse2 | 1.88 |
DataPPR:
| Slice | Group | Pulse1 | Pulse2 |
|---|---|---|---|
| 1 | WT | 1.42 | 2.70 |
| 2 | WT | 0.78 | 0.87 |
| 3 | WT | 0.96 | 1.64 |
| 4 | WT | 0.64 | 1.35 |
| 5 | WT | 0.92 | 1.88 |
DT:
| Slice | Group | Pulse | Amplitude |
|---|---|---|---|
| 1 | WT | Pulse1 | 1.42 |
| 2 | WT | Pulse1 | 0.78 |
| 3 | WT | Pulse1 | 0.96 |
| 4 | WT | Pulse1 | 0.64 |
| 5 | WT | Pulse1 | 0.92 |
| 1 | WT | Pulse2 | 2.70 |
| 2 | WT | Pulse2 | 0.87 |
| 3 | WT | Pulse2 | 1.64 |
| 4 | WT | Pulse2 | 1.35 |
| 5 | WT | Pulse2 | 1.88 |
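One way to get from the wide DataPPR table to the long DT table is melt() from data.table. The call used in the slides is not shown, so this is a sketch under that assumption, using the five rows from the tables above.

```r
library(data.table)

DataPPR <- data.table(Slice  = 1:5,
                      Group  = "WT",
                      Pulse1 = c(1.42, 0.78, 0.96, 0.64, 0.92),
                      Pulse2 = c(2.70, 0.87, 1.64, 1.35, 1.88))

# Wide -> long: Pulse1/Pulse2 become values of a "Pulse" column
DT <- melt(DataPPR,
           id.vars       = c("Slice", "Group"),
           variable.name = "Pulse",
           value.name    = "Amplitude")
DT
```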
Change or add columns
gsub(pattern = "Pulse", replacement = "", x = "Pulse1") → "1"
Slice Group Pulse Amplitude
1: 1 WT 1 1.42
2: 2 WT 1 0.78
3: 3 WT 1 0.96
4: 4 WT 1 0.64
5: 5 WT 1 0.92
---
58: 27 KO 2 2.05
59: 28 KO 2 1.14
60: 29 KO 2 1.39
61: 30 KO 2 1.06
62: 31 KO 2 1.18
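The column change above can be sketched by combining := with gsub(). The small table is rebuilt here from the first rows shown earlier so that the example runs on its own.

```r
library(data.table)

DT <- data.table(Slice     = c(1, 2, 1, 2),
                 Group     = "WT",
                 Pulse     = c("Pulse1", "Pulse1", "Pulse2", "Pulse2"),
                 Amplitude = c(1.42, 0.78, 2.70, 0.87))

# Overwrite the Pulse column: strip the text "Pulse", keep the number
DT[, Pulse := gsub(pattern = "Pulse", replacement = "", x = Pulse)]
DT$Pulse   # "1" "1" "2" "2"
```

Note that the result is still of type character, which is why a discrete x scale is used in the plots.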
The plot structure is done!
Can we make it pretty though?
ggplot(data = DT, mapping = aes(x = Pulse, y = Amplitude, group = Slice, colour = Group))+
  geom_point(alpha=0.5, size=4)+
  geom_line()+
  facet_wrap(facets = ~ Group)+
  scale_y_continuous(name = "Amplitude (mV)", limits = c(0,5), expand = c(0,0))+
  scale_x_discrete(name = "Pulse Number")+
  scale_colour_manual(values = c("WT" = "black", "KO" = "red"), name = "")

# Add a classic theme
ggplot(data = DT, mapping = aes(x = Pulse, y = Amplitude, group = Slice, colour = Group))+
  geom_point(alpha=0.5, size=4)+
  geom_line()+
  facet_wrap(facets = ~ Group)+
  scale_y_continuous(name = "Amplitude (mV)", limits = c(0,5), expand = c(0,0))+
  scale_x_discrete(name = "Pulse Number")+
  scale_colour_manual(values = c("WT" = "black", "KO" = "red"), name = "")+
  theme_classic()

# Remove the box around the facet labels
ggplot(data = DT, mapping = aes(x = Pulse, y = Amplitude, group = Slice, colour = Group))+
  geom_point(alpha=0.5, size=4)+
  geom_line()+
  facet_wrap(facets = ~ Group)+
  scale_y_continuous(name = "Amplitude (mV)", limits = c(0,5), expand = c(0,0))+
  scale_x_discrete(name = "Pulse Number")+
  scale_colour_manual(values = c("WT" = "black", "KO" = "red"), name = "")+
  theme_classic()+
  theme(strip.background = element_blank())

# Store the finished plot in an object
AmplitudePlot <- ggplot(data = DT, aes(x = as.factor(Pulse), y = Amplitude, group = Slice, colour = Group))+
  geom_point(alpha=0.5, size=4)+
  geom_line()+
  facet_wrap(facets = ~ Group)+
  scale_y_continuous(name = "Amplitude (mV)", limits = c(0,5), expand = c(0,0))+
  scale_x_discrete(name = "Pulse Number")+
  scale_colour_manual(values = c("WT" = "black", "KO" = "red"), name = "")+
  theme_classic()+
  theme(strip.background = element_blank(), strip.placement = "outside")

DT:
| Slice | Group | Pulse | Amplitude |
|---|---|---|---|
| 1 | WT | 1 | 1.42 |
| 2 | WT | 1 | 0.78 |
| 3 | WT | 1 | 0.96 |
| 4 | WT | 1 | 0.64 |
| 5 | WT | 1 | 0.92 |
| 1 | WT | 2 | 2.70 |
| 2 | WT | 2 | 0.87 |
| 3 | WT | 2 | 1.64 |
| 4 | WT | 2 | 1.35 |
| 5 | WT | 2 | 1.88 |
PPRTable:
| Slice | Group | 1 | 2 |
|---|---|---|---|
| 1 | WT | 1.42 | 2.70 |
| 2 | WT | 0.78 | 0.87 |
| 3 | WT | 0.96 | 1.64 |
| 4 | WT | 0.64 | 1.35 |
| 5 | WT | 0.92 | 1.88 |
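Going back from the long DT table to the wide PPRTable can be done with dcast() from data.table, after which the paired-pulse ratio is the second amplitude divided by the first. The exact call from the slides is not shown, so this is a sketch using two slices from the tables above.

```r
library(data.table)

DT <- data.table(Slice     = rep(1:2, times = 2),
                 Group     = "WT",
                 Pulse     = rep(c("1", "2"), each = 2),
                 Amplitude = c(1.42, 0.78, 2.70, 0.87))

# Long -> wide: one column per pulse number
PPRTable <- dcast(DT, Slice + Group ~ Pulse, value.var = "Amplitude")

# Column names that are numbers need backticks
PPRTable[, PPR := `2` / `1`]
PPRTable
```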
PPRPlot <- ggplot(data = PPRTable, aes(x = Group,
                                       y = PPR,
                                       colour = Group))+
  ggbeeswarm::geom_beeswarm(size=4, alpha=0.5, cex = 6)+
  scale_y_continuous(name = "PPR")+
  scale_x_discrete(name="", position = "top")+
  scale_colour_manual(values = c("WT" = "black", "KO" = "red"), name = "")+
  theme_classic()+
  theme(legend.position = "none", axis.line.x = element_blank(), axis.ticks.x = element_blank())

patchwork
Change tag annotation (can also recognise nested plots: A, B1, B2)
Change layout based on dimensions or with “design matrix”
library(patchwork)
CompletePanel <- AmplitudePlot + PPRPlot +
  plot_annotation(tag_levels = "A") +
  plot_layout(widths = c(2,1), guides='collect') &
  theme(axis.text = element_text(size=16, colour = "black"),
        strip.text = element_text(size=16),
        axis.title = element_text(size=18),
        plot.tag = element_text(size=22))

You can save files as: eps, ps, tex (pictex), pdf, jpeg, tiff, png, bmp, svg or wmf (windows only)
Units: in, cm, mm, or px
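Saving with a given size and unit can be sketched with ggsave() from ggplot2. The slides do not show the call, so the function choice, the file name and the dimensions are assumptions; `mtcars` is only a stand-in data set.

```r
library(ggplot2)

p <- ggplot(data = mtcars, aes(x = wt, y = mpg)) +
  geom_point()

# The file extension decides the format (here: pdf)
out_file <- file.path(tempdir(), "ExamplePlot.pdf")
ggsave(filename = out_file, plot = p, width = 18, height = 9, units = "cm")
```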
It is impossible to know everything, but you will find everything online or in R itself!
type ? before a function, use help(), or the help tab
great forum to find solutions stackoverflow.com
for ggplot2 ggplot2.tidyverse.org
for other packages look for “packagename vignette”
find cheat sheets! (Help -> Cheat Sheets)
Google: there is likely no problem which can’t be found
So far we have only considered normally distributed data
But things are not normal/Gaussian as often as we think…
Often parameters can be normally distributed (e.g. the mean) but not the data itself.
But there is a whole new world!
There is a large set of distributions: - beta distribution
Don’t worry about any of these. R will take care of it automatically
Git: can be a personal repository
:= changes the table in place: no copy of the data is necessary compared to using =